Correlations in Scale-Free Networks: Tomography and Percolation 
We discuss three related models of scale-free networks with the same degree distribution but
different correlation properties. Starting from the Barabasi-Albert construction based on growth and
preferential attachment we discuss two other networks emerging when randomizing it with respect
to links or nodes. We point out that the Barabasi-Albert model displays dissortative behavior with
respect to the nodes' degrees, while the node-randomized network shows assortative mixing. These
kinds of correlations are visualized by discussing the shell structure of the networks around an
arbitrary node. In spite of different correlation behavior, all three constructions exhibit similar
percolation properties.

PACS numbers: 89.75.-k, 05.50.+q, 89.75.Hc 



INTRODUCTION 

Scale-free networks, i.e. networks with power-law de- 
gree distributions, have recently been widely studied (see 
Refs. [1,2] for a review). Such degree distributions have 
been found in many different contexts, for example in sev- 
eral technological webs like the Internet [3,4], the WWW 
[5,6], or electrical power grids [7], in natural networks like 
the network of chemical reactions in the living cell [8-10] 
and also in social networks, like the network of human
sexual contacts [11], the science [12,13] and the movie
actor [14,15] collaboration networks, or the network of
phone calls [16].

The topology of networks is essential for the spread of
information or infections, as well as for the robustness
of networks against intentional attack or random breakdown
of elements. Recent studies have focused on a more
detailed topological characterization of networks, in
particular on the degree correlations among nodes [4,17-26].
For instance, in many technological and biological networks
nodes with high degree connect preferably to
nodes with low degree [4,21], a property referred to as
disassortative mixing. On the other hand, social networks
show assortative mixing [17,25], i.e. highly connected
nodes are preferably connected to nodes with high degree.

In this paper we shall study some aspects of this topology,
specifically the importance of the degree correlations,
in three related models of scale-free networks and
concentrate on two important characteristics: the
tomography of the shell structure around an arbitrary node,
and percolation.



THE MODELS 

Our starting model is the one of Barabasi and Albert
(BA) [27], based on the growth algorithm with preferential
attachment. Starting from an arbitrary set of initial
nodes, at each time step a new node is added to the network.
This node brings with it m proper links which
are connected to m nodes already present. The latter
are chosen according to the preferential attachment
prescription: The probability that a new link connects to
a certain node is proportional to the degree (number of
links) of that node. The resulting degree distribution of
such networks tends to [28-30]:



\[ P(k) = \frac{2m(m+1)}{k(k+1)(k+2)}. \tag{1} \]
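
The growth rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the simulations in this paper: the seed graph (a complete graph on m + 1 nodes), the requirement that the m targets of a new node be distinct, and the function name `barabasi_albert` are all assumptions made here for concreteness.

```python
import random

def barabasi_albert(n, m, seed=None):
    """Grow a BA-type network; returns a set of undirected edges (u, v) with u < v."""
    rng = random.Random(seed)
    # Assumed seed graph: complete graph on m + 1 nodes, so every node starts with degree m.
    edges = {(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)}
    # Each node appears in `ends` once per incident link, so a uniform draw
    # from `ends` selects a node with probability proportional to its degree.
    ends = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:        # assumed: m distinct targets, no multiple edges
            targets.add(rng.choice(ends))
        for t in targets:
            edges.add((t, new))        # t < new always holds
            ends.extend((t, new))
    return edges
```

Keeping a flat list of link ends makes the preferential choice a single uniform draw, which avoids recomputing degree-proportional weights at every step.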

Krapivsky and Redner [30] have shown that in the BA
construction correlations develop spontaneously between
the degrees of connected nodes. To assess the role of such
correlations we shall randomize the BA network.

Recently Maslov and Sneppen [21] have suggested an
algorithm for randomizing a given network that keeps the
degree distribution constant. According to this algorithm
at each step two links of the network are chosen at random.
Then, one end of each link is selected randomly
and the attaching nodes are interchanged. However, in
case one or both of these new links already exist in the
network, this step is discarded and a new pair of edges
is selected. This restriction prevents the appearance
of multiple edges connecting the same pair of nodes. A
repeated application of the rewiring step leads to a
randomized version of the original network. We shall refer
to this model as the link-randomized (LR) model.
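
A single rewiring step of this kind can be sketched as follows. The sketch assumes a simple undirected graph given as a list of node pairs; the function name `link_randomize` and the additional rejection of self-loops are assumptions of this illustration, not details taken from Ref. [21].

```python
import random

def link_randomize(edges, n_steps, seed=None):
    """Degree-preserving rewiring: pick two links at random, swap one end of each."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    for _ in range(n_steps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Picking one end of each link at random is equivalent to randomly
        # orienting the second link before interchanging the partners.
        if rng.random() < 0.5:
            c, d = d, c
        # Proposed new links: (a, d) and (c, b).
        if a == d or c == b:
            continue                   # self-loop (assumed to be rejected as well)
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue                   # multiple edge: discard the step
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list
```

Every accepted step removes one link end and adds one link end at each of the four nodes involved, so the degree of every node is unchanged.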

The LR model can be compared with another model
which is widely studied in the context of scale-free
networks, namely with the configuration model introduced
by Bender and Canfield [31,32]. It starts with a given
number $N$ of nodes and assigns to each node a number
$k_i$ of "edge stubs" equal to its desired connectivity.
The stubs of different nodes are then connected randomly
to each other; two connected stubs form a link. One
of the limitations of this "stub reconnection" algorithm
is that for a broad distribution of connectivities, which is
usually the case in complex networks, the algorithm
generates multiple edges joining the same pair of hub nodes
and loops connecting a node to itself. However, the
configuration model and the LR model become equivalent as
$N \to \infty$.

One can also consider a node-randomized (NR) counterpart
of the LR randomization procedure. The only difference
from the link-randomized algorithm is that instead
of randomly choosing two links we randomly choose two
nodes in the network. Then the procedure is the same as
in the LR model.
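
The text leaves some freedom in how a pair of nodes is turned into a pair of links. The sketch below adopts one reading, which is an assumption of this illustration: two distinct nodes u and v are drawn uniformly, one link of each is drawn uniformly, (u, x) and (v, y), and the links are exchanged so that u and v become connected to each other, giving (u, v) and (x, y). The function name `node_randomize` is hypothetical.

```python
import random

def node_randomize(edges, n_steps, seed=None):
    """Degree-preserving rewiring driven by randomly chosen nodes (assumed reading)."""
    rng = random.Random(seed)
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = sorted(adj)
    for _ in range(n_steps):
        u, v = rng.sample(nodes, 2)          # two distinct nodes, uniformly at random
        x = rng.choice(sorted(adj[u]))       # one link of u: (u, x)
        y = rng.choice(sorted(adj[v]))       # one link of v: (v, y)
        # Proposed rewiring: (u, x), (v, y) -> (u, v), (x, y).
        if len({u, v, x, y}) < 4:
            continue                         # shared node: self-loop or no change
        if v in adj[u] or y in adj[x]:
            continue                         # multiple edge: discard the step
        adj[u].remove(x)
        adj[x].remove(u)
        adj[v].remove(y)
        adj[y].remove(v)
        adj[u].add(v)
        adj[v].add(u)
        adj[x].add(y)
        adj[y].add(x)
    return [(a, b) for a in adj for b in adj[a] if a < b]
```

Because nodes rather than links are drawn uniformly, low-degree nodes are selected more often than they would be as the end of a random link, which is the source of the bias discussed later in the text.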

As we proceed to show, the three models have differ- 
ent properties with respect to the correlations between 
the degrees of connected nodes. While the LR (config- 
uration) model is random, the genuine BA prescription 
leads to a network which is dissortative with respect to 
the degrees of connected nodes, and the NR model leads 
to an assortative network. This fact leads to considerable 
differences in the shell structure of the networks and also 
to some (not extremely large) differences in their percolation
characteristics. We hasten to note that our simple
models neglect many important aspects of real networks
like geography [33,34] but stress the importance of
considering the higher correlations in the degrees of connected
nodes.



TOMOGRAPHY OF THE NETWORKS 

Referring to the spreading of computer viruses or human
diseases, it is necessary to know how many sites get
infected at each step of the infection propagation. Thus,
we examine the local structure of the network. Cohen
et al. [35] examined the shells around the node with the
highest degree in the network. In our study we start from
a node chosen at random. This initial node (the root) is
assigned to shell number 0. Then all links starting at
this node are followed. All nodes reached are assigned to
shell number 1. Then all links leaving a node in shell 1
are followed and all nodes reached that don't belong to
previous shells are labelled as nodes of shell 2. The same
is carried out for shell 2 etc., until the whole network
is exhausted. We then get $N_{l,r}(k)$, the number of nodes
of degree $k$ in shell $l$ for root $r$. The whole procedure is repeated
starting at all $N$ nodes in the network, giving $P_l(k)$, the
degree distribution in shell $l$. We define $P_l(k)$ as:

\[ P_l(k) = \frac{\sum_r N_{l,r}(k)}{\sum_{k,r} N_{l,r}(k)}. \tag{2} \]
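
The shell construction is a breadth-first search repeated from every root. The following sketch assumes that the counts are pooled over all roots before normalizing and that the network is given as a dictionary mapping each node to the set of its neighbours; the function name `shell_degree_stats` is hypothetical.

```python
from collections import deque, defaultdict

def shell_degree_stats(adj):
    """adj: dict node -> set of neighbours.
    Returns (P, mean): P[l][k] is the pooled degree distribution of shell l,
    mean[l] is the mean degree in shell l."""
    counts = defaultdict(lambda: defaultdict(int))   # counts[l][k], summed over roots
    for root in adj:
        shell = {root: 0}
        queue = deque([root])
        while queue:                                 # breadth-first search from the root
            node = queue.popleft()
            counts[shell[node]][len(adj[node])] += 1
            for nb in adj[node]:
                if nb not in shell:
                    shell[nb] = shell[node] + 1
                    queue.append(nb)
    P, mean = {}, {}
    for l, by_degree in counts.items():
        total = sum(by_degree.values())
        P[l] = {k: c / total for k, c in by_degree.items()}
        mean[l] = sum(k * p for k, p in P[l].items())
    return P, mean
```

For a star with one centre and four leaves, for example, shell 1 contains the four leaves when the centre is the root and the centre when any leaf is the root, so the pooled mean degree in shell 1 is (4·1 + 4·4)/8 = 2.5. Running one search per root costs on the order of N times the number of links, so for large networks one may prefer to sample roots.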



We are most interested in the average degree $\langle k \rangle_l =
\sum_k k P_l(k)$ of nodes of the shell $l$. In the epidemiological
context, this quantity can be interpreted as a disease
multiplication factor after $l$ steps of propagation. It
describes how many neighbors a node can infect on average.
Note that such a definition of $P_l(k)$ gives us for the degree
distribution in the first shell:

\[ P_1(k) = \frac{k N_k}{\sum_k k N_k} = \frac{k P(k)}{\langle k \rangle}, \tag{3} \]

where $P(k)$ and $N_k$ are the degree distribution and the
number of nodes with degree $k$ in the network, respectively.
We bear in mind that every link in the network is
followed exactly once in each direction. Hence, we find
that every node with degree $k$ is counted exactly $k$ times.
From Eq. (3) it follows that $\langle k \rangle_1 = \langle k^2 \rangle / \langle k \rangle$. This quantity,
which plays a very important role in the percolation theory
of networks [36], depends only on the first and second
moments of the degree distribution, but not on the
correlations. Of course $P_0(k) = P(k)$.
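
For completeness, the step from the first-shell distribution to its mean can be written out explicitly. The line below assumes only that $P_1(k) = kP(k)/\langle k \rangle$, i.e. that each node of degree $k$ is counted $k$ times:

```latex
\langle k \rangle_1
  = \sum_k k \, P_1(k)
  = \sum_k k \, \frac{k P(k)}{\langle k \rangle}
  = \frac{\langle k^2 \rangle}{\langle k \rangle}
```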

Note that as $N \to \infty$ we have $\langle k^2 \rangle \to \infty$: for our scale-free
constructions the mean degree in shell 1 depends
significantly on the network size determining the cutoff in
the degree distribution. However, the values of $\langle k \rangle_1$ are
the same for all three models: The first two shells are
determined only by the degree distributions. In all other
shells the three models differ. For the LR (configuration)
model one finds for all shells in the thermodynamic limit
$P_l(k) = P_1(k)$. However, since these distributions do not
possess finite means, the values of $\langle k \rangle_l$ are governed by
the finite-size cutoff, which is different in different shells,
since the network is practically exhausted within the first
few steps, see Fig. 1.

In what follows we compare the shell structure of the
BA, the LR and the NR models. We discuss in detail the
networks based on the BA construction with $m = 2$. For
larger $m$ the same qualitative results were observed. In
the present work we refrain from discussing the peculiar
case $m = 1$. For $m = 1$ the topology of the BA model
is distinct from that for $m \ge 2$ since in this case the
network is a tree. This connected tree is destroyed by the
randomization procedure and is transformed into a set of
disconnected clusters. On the other hand, for $m \ge 2$ the
creation of large separate clusters under randomization
is rather improbable, so that most of the nodes stay
connected. Fig. 1 shows $\langle k \rangle_l$ as a function of the shell number
$l$. Panel (a) corresponds to the BA model, panel (b) to
the LR model, and panel (c) to the NR model. The
different curves show simulations for different network sizes:
N = 3,000; N = 10,000; N = 30,000; and N = 100,000.
All points are averaged over ten different realizations
except for those for networks of 100,000 nodes, which are based on only one
simulation. In panel (d) we compare the shell structure
for all three models at N = 30,000. The most significant
feature of the graphs is the difference in $\langle k \rangle_2$. In the BA
and LR models the maximum is reached in the first shell,
while for the NR model the maximum is reached only in
the second shell: $\langle k \rangle_{2,BA} < \langle k \rangle_{2,LR} < \langle k \rangle_{2,NR}$. This
effect becomes more pronounced with increasing network
size. In shells with large $l$, for all networks mostly nodes
with the lowest degree 2 are found.

The inset in graph (a) of Fig. 1 shows the average age
$\eta$ of nodes with connectivity $k$ as a function of their
degree for the BA model.
The age of a node $n$ and of any of its proper links is
defined as $\eta(n) = (N - t_n)/N$, where $t_n$ denotes the time
of birth of the node. For the randomized LR and NR
models age has no meaning. The figure shows a strong
correlation between age and degree of a node. The
reasons for these strong correlations are as follows: First,
older nodes experienced more time steps than younger
ones and thus have a larger probability to acquire
non-proper bonds. Second, at earlier times there are fewer
nodes in the network, so that the probability of acquiring
a new link per time step for an individual node is even
higher. Third, at later time steps older nodes already
tend to have higher degrees than younger ones, so the
probability for them to acquire new links is considerably
larger due to preferential attachment. The correlations
between the age and the degree bring some nontrivial
aspects into the BA model based on growth, which are
erased when randomizing the network.

Let us discuss the degree distribution in the second
shell. In this case we find that every link leaving a
node of degree $k$ is counted $k - 1$ times. Let $P(l|k)$ be
the probability that a link leaving a node of degree $k$
enters a node with degree $l$. Neglecting the possibility of
short loops (which is always appropriate in the
thermodynamic limit $N \to \infty$) and the inherent direction of
links (which may be not totally appropriate for the BA
model) we have:

\[ P_2(l) = \frac{\sum_k k P(k)(k-1) P(l|k)}{\sum_k k P(k)(k-1)}. \tag{4} \]




FIG. 1. Mean degree value $\langle k \rangle_l$ in shell $l$: (a) for the BA-model, (b) for the LR-model, (c) for the NR-model. Different curves
correspond to different network sizes: from top to bottom 100,000; 30,000; 10,000; 3,000 nodes. 10 simulations were done for
each value except for the shells with $l > 2$ at N = 100,000 based on only one simulation. Panel (d) compares the tomography
of the models with N = 30,000: from top to bottom NR model; LR model; BA model. The inset in panel (a) shows the average
age $\eta$ of a node as a function of its degree $k$.

The value of $\langle k \rangle_2$ gives important information about
the type of mixing in the network. To study mixing in
networks one needs to divide the nodes into groups with
identical properties. The only relevant characteristic of
the nodes that is present in all three models is their
degree. Thus, we can examine the degree correlations
between neighboring nodes, which we compare with the
uncorrelated LR model, where the probability that a link
connects to a node with a certain degree is independent
of whatever is attached to the other end of the link:
$P(k|l) = kP(k)/\langle k \rangle = kP(k)/2m$. All other relations
would correspond to assortative or disassortative mixing.
Qualitatively, assortativity then means that nodes
attach to nodes with similar degree more likely than in
the LR model: $P(k|l) > P(k|l)_{LR} = kP(k)/\langle k \rangle$ for $k \approx l$.
Dissortativity means that nodes attach to nodes with
very different degree more likely than in the LR model:
$P(k|l) > kP(k)/\langle k \rangle$ for $k \gg l$ or $l \gg k$. Inserting this
in Eq. (4), and calculating the mean, one finds qualitatively
that $\langle k \rangle_1 = \langle k \rangle_{2,LR} < \langle k \rangle_2$ for assortativity, and
$\langle k \rangle_1 > \langle k \rangle_2$ for dissortativity.
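
The equality for the uncorrelated case can be checked directly. The derivation below takes Eq. (4) in the form $P_2(l) = \sum_k kP(k)(k-1)P(l|k) / \sum_k kP(k)(k-1)$ and inserts the uncorrelated conditional probability $P(l|k) = lP(l)/\langle k \rangle$, which does not depend on $k$ and therefore factors out of the sum:

```latex
P_2(l)\big|_{\mathrm{LR}}
  = \frac{\sum_k k P(k)(k-1)\,\dfrac{l P(l)}{\langle k \rangle}}{\sum_k k P(k)(k-1)}
  = \frac{l P(l)}{\langle k \rangle}
  = P_1(l)
\quad\Longrightarrow\quad
\langle k \rangle_{2,\mathrm{LR}}
  = \sum_l l \, P_2(l)\big|_{\mathrm{LR}}
  = \frac{\langle k^2 \rangle}{\langle k \rangle}
  = \langle k \rangle_1
```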

In the following we show where the correlations of the
BA and NR models originate. A consequence of the BA
algorithm is that there are two different types of ends for
the links. Each node has exactly $m$ proper links attached
to it at the moment of its birth and a certain number of
links that are attached later. Since each node receives
the same number of links at its birth, towards the proper
nodes a link encounters a node with degree $k$ with
probability $P(k)$. To compensate for this, in the other direction
a node with degree $k$ is encountered with the probability
$\frac{k-m}{m} P(k) = \left( \frac{2k}{\langle k \rangle} - 1 \right) P(k)$, so that both distributions
together yield $kP(k)/\langle k \rangle$. On one end of the link nodes
with small degree are predominant: $P(k) > kP(k)/\langle k \rangle$
for small $k$. On the other end nodes with high degree are
predominant: $(k - m)P(k)/m > kP(k)/2m$ for $k$ large.
This corresponds to dissortativity. Actually the situation
is somewhat more complex since in the BA model these
probability distributions also depend on the age of the
link.
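
As a short consistency check, assuming $\langle k \rangle = 2m$ and that the two ends of a link are weighted equally, the distribution at the non-proper end is normalized, and the average over both ends reproduces the link-end distribution of the uncorrelated case:

```latex
\sum_k \frac{k-m}{m} P(k) = \frac{\langle k \rangle - m}{m} = 1,
\qquad
\frac{1}{2}\left[ P(k) + \frac{k-m}{m} P(k) \right]
  = \frac{k}{2m} P(k)
  = \frac{k P(k)}{\langle k \rangle}
```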

Assortativity of the NR model is a result of the
node-randomizing process. Since the nodes with smaller
degree are predominant in the node population, those links
are preferably chosen that have on the end with the
randomly chosen node a node with a smaller degree
($P(k) > kP(k)/\langle k \rangle$ for $k$ small). Then the randomization
algorithm exchanges the links and connects those nodes
to each other. This leads to assortativity for nodes with
small degree, which is compensated by assortativity for
nodes with high degree.



PERCOLATION 

Percolation properties of networks are relevant when
discussing their vulnerability to attack or immunization
which removes nodes or links from the network. For
scale-free networks random percolation as well as
vulnerability to a deliberate attack have been studied by several
groups [36-40]. One considers the removal of a certain
fraction of edges or nodes in a network. Our simulations
correspond to the node removal model; $q$ is the fraction of
removed nodes. Below the percolation threshold, $q < q_c$,
a giant component (infinite cluster) exists, which ceases
to exist above the threshold. A giant component, and
consequently $q_c$, is exactly defined only in the
thermodynamic limit $N \to \infty$: it is a cluster to which a nonzero
fraction of all nodes belongs.
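
The node removal experiment can be sketched with a union-find structure. This is an illustrative sketch under stated assumptions, not the simulation code of this paper: nodes are labelled 0, ..., n - 1, the function name `giant_fraction` is hypothetical, and in a finite network the largest remaining cluster is used as a stand-in for the giant component.

```python
import random

def giant_fraction(n, edges, q, seed=None):
    """Remove a fraction q of the n nodes at random; return the size of the
    largest remaining cluster divided by the original number of nodes n."""
    rng = random.Random(seed)
    removed = set(rng.sample(range(n), int(round(q * n))))
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    for u, v in edges:
        if u in removed or v in removed:
            continue                         # links of removed nodes disappear
        ru, rv = find(u), find(v)
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru                  # union by size
            size[ru] += size[rv]
    kept = [x for x in range(n) if x not in removed]
    return max((size[find(x)] for x in kept), default=0) / n
```

Dividing by (1 - q) n instead of n gives the fraction relative to the nodes that remain in the network.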

In [32] and [36] a condition for the percolation
transition in random networks has been discussed: Every node
already connected to the spanning cluster is connected
to at least one new node. Ref. [36] gives the following
percolation criterion for the configuration model:

\[ 1 - q_c = \frac{\langle k \rangle}{\langle k^2 \rangle - \langle k \rangle}, \tag{5} \]

where the means correspond to an unperturbed network
($q = 0$). For networks with degree distribution Eq. (1),
$\langle k^2 \rangle$ diverges as $N \to \infty$. This yields for random
networks with such a degree distribution a percolation
threshold $q_c = 1$ in the thermodynamic limit,
independent of the minimal degree $m$; in epidemiological
terms this corresponds to the absence of herd immunity
in such systems. Crucial for this threshold is the
power-law tail of the degree distribution with an
exponent $\le 3$. Moreover, Ref. [37] shows that the critical
exponent $\beta$ governing the fraction of nodes $M_\infty$ of the
giant component, $M_\infty \propto (q_c - q)^\beta$, diverges as the
exponent of the degree distribution approaches $-3$. Therefore
$M_\infty$ approaches zero with zero slope as $q \to 1$.
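
To see how slowly the threshold approaches unity, one can evaluate the criterion of Ref. [36] in its standard form, $1 - q_c = \langle k \rangle / (\langle k^2 \rangle - \langle k \rangle)$, for the degree distribution of Eq. (1) truncated at a maximal degree. The sharp cutoff `k_max`, the renormalization after truncation, and the function name `critical_fraction` are assumptions of this sketch; a real finite network has a smoother cutoff.

```python
def critical_fraction(m, k_max):
    """q_c from 1 - q_c = <k> / (<k^2> - <k>) for P(k) of Eq. (1), truncated at k_max."""
    ks = range(m, k_max + 1)
    weights = [2.0 * m * (m + 1) / (k * (k + 1) * (k + 2)) for k in ks]
    norm = sum(weights)                               # renormalize after truncation
    k1 = sum(k * w for k, w in zip(ks, weights)) / norm
    k2 = sum(k * k * w for k, w in zip(ks, weights)) / norm
    return 1.0 - k1 / (k2 - k1)
```

Since the second moment grows only logarithmically with the cutoff, the value returned for m = 2 moves towards 1 very slowly as `k_max` increases.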

In Fig. 2 we plot $M_\infty$ as a function of $q$ for the three
models discussed. The behavior of all three models
for a network size of 300,000 nodes is presented in panel
(a). In the inset the size of the giant component was
measured in relation to the number of nodes remaining
in the network, $(1 - q)N$, and not to their initial number $N$.
Other panels show the percolation behavior of each of the
models at different network sizes: Panel (b) corresponds
to the BA model, (c) to the LR model, and (d) to the NR
model. For the largest networks with N = 300,000 nodes
we calculated 5 realizations for each model; for those with
30,000; 10,000; and 3,000 nodes averaging over 10
realizations was performed. For all three models within the
error bars the curves at different network sizes coincide.
This shows that even the smallest network is already close
to the thermodynamic limit. R. Albert et al. have
found a similar behavior in a study of BA networks [38].
They analyze networks of sizes N = 1000, 5000 and 20000
concluding "that the overall clustering scenario and the
value of the critical point is independent of the size of
the system".

In the simulations we find two regimes: for moderate
$q$ we find that the sizes of the giant components
of the BA, LR, and NR models obey the inequalities
$M_{\infty,BA} > M_{\infty,LR} > M_{\infty,NR}$, while for $q$ close to
unity the inequalities are reversed: $M_{\infty,BA} < M_{\infty,LR} <
M_{\infty,NR}$. However, in this regime the differences between
$M_{\infty,BA}$, $M_{\infty,LR}$ and $M_{\infty,NR}$ are subtle and hardly
resolved on the scales of Fig. 2. We note that a similar
situation was observed in Ref. [17]. However, there the
size of the giant cluster was measured not as a function
of $q$ but of a scaling parameter in the degree distribution.

The observed effects can be explained by the
correlations in the network. For $q = 0$ one has $M_{\infty,BA} =
M_{\infty,LR} = M_{\infty,NR}$. Now, the probability that single
nodes lose their connection to the giant cluster depends
only on the degree distribution, and not on correlations.
So, the difference in the $M_\infty$ must be explained by the
break-off of clusters containing more than one node. The
probability for such an event is smaller in the BA than
in the LR model, since dissortativity implies that one
finds fewer 'regions' where only nodes with low degree
are present.

However, when we get to the region of large q, as nodes 
with low degree act as 'bridges' between the nodes with 
high degree, the connections between the nodes with high 
degree are weaker in the case of the BA model than in 
the case of the LR model. So, the probability that nodes 
with high degree break off is higher for the BA model 
than for the LR model. There is no robust core of high- 
degree nodes in the network [17]. The correlation effects 
for the NR model, when compared with the LR model, 
are opposite to those for the BA model. 




FIG. 2. Fraction of nodes $M_\infty$ in the giant component depending on the fraction $q$ of nodes removed from the network: (b)
for the BA-model, (c) for the LR-model, (d) for the NR-model. Different curves correspond to different network sizes: from
top to bottom 300,000 (5 simulations); 30,000; 10,000; 3,000 nodes (10 simulations each). Graph (a) compares all three models
at N = 300,000 (from top to bottom: BA-model, LR-model, NR-model). The inset shows the fraction $M_\infty$ of the number of
nodes in the giant component relative to the remaining number of nodes in the network, $(1 - q)N$.

CONCLUSION

We consider three different models of scale-free
networks: the genuine Barabasi-Albert construction based
on growth and preferential attachment, and two networks
emerging when randomizing it with respect to links or
nodes. We point out that the BA model shows
dissortative behavior with respect to the nodes' degrees, while
the node-randomized network shows assortative mixing.
However, these strong differences in the shell structure
lead only to moderate quantitative differences in the
percolation behavior of the networks.